Newest 'algorithm reinforcement-learning' Questions

2 votes

1 answer

1k views

Factors that affect the number of iterations of value iteration

I assumed that value iteration would take more iterations to converge as the map size or the environment's complexity increases. I tried to verify this idea by running value iteration on ...

john li

23

asked Mar 27, 2022 at 3:03

0 votes

1 answer

125 views

What reinforcement learning algorithm should I use in continuous states?

I want to use reinforcement learning in an environment I made. The exact environment doesn't really matter, but it comes down to this: The number of different states in the environment is infinite, e.g....

SirPVP

3

asked Jul 2, 2020 at 0:06

1 vote

1 answer

136 views

In RL, if I assign the rewards for better positional play, the algorithm is learning nothing?

I'm creating an RL application for the game Connect Four. If I tell the algorithm which moves/token positions will receive greater rewards, surely it's not actually learning anything; it's just a ...

mason7663

653

asked Apr 4, 2020 at 13:21

1 vote

0 answers

66 views

Action spaces for an RTS game

I think reinforcement learning would be a good fit for this problem, but I am not sure how to deal with a seemingly infinite number of actions. At the beginning of each game (generic RTS game), the ...

Quaxton Hale

111

asked May 21, 2019 at 18:44

3 votes

1 answer

9k views

What is the time complexity of the value iteration algorithm?

Recently, I came across the claim (in lectures 8 and 9, about MDPs, of this UC Berkeley AI course) that the time complexity for each iteration of the value iteration algorithm is $\mathcal{O}(|S|^...

Shifat E Arman

83

asked Nov 17, 2018 at 13:46

18 votes

4 answers

3k views

Why does the discount rate in the REINFORCE algorithm appear twice?

I was reading the book Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto (complete draft, November 5, 2017). On page 271, the pseudo-code for the episodic Monte-Carlo ...

Diego Orellana

373

asked Aug 22, 2018 at 18:06

2 votes

0 answers

94 views

Which features and algorithm could optimize this air-conditioner problem?

Imagine we have 2 air conditioner systems (AA) and 2 "free cooling" systems which mix external and internal air (FC) in a closed box which always tends to warm up. For each system, we have to find ...

freesoul

246

asked Jan 9, 2018 at 18:26

2 votes

1 answer

720 views

What algorithm should I use to classify documents?

I'd like to build a program that would learn to automatically classify documents. The principle would be that, for each new document I add to the system, it would automatically infer in which category ...

Charles Brunet

137

asked Feb 4, 2017 at 16:41
